nlp_architect.data.cdc_resources.relations.wikipedia_relation_extraction.WikipediaRelationExtraction

class nlp_architect.data.cdc_resources.relations.wikipedia_relation_extraction.WikipediaRelationExtraction(method: nlp_architect.data.cdc_resources.relations.relation_types_enums.WikipediaSearchMethod = <WikipediaSearchMethod.ONLINE: 'online'>, wiki_file: str = None, host: str = None, port: int = None, index: str = None, filter_pronouns: bool = True, filter_time_data: bool = True)[source]
__init__(method: nlp_architect.data.cdc_resources.relations.relation_types_enums.WikipediaSearchMethod = <WikipediaSearchMethod.ONLINE: 'online'>, wiki_file: str = None, host: str = None, port: int = None, index: str = None, filter_pronouns: bool = True, filter_time_data: bool = True) → None[source]

Extract the relation between two mentions according to Wikipedia knowledge

Parameters
  • method (optional) – WikipediaSearchMethod.{ONLINE/OFFLINE/ELASTIC} Run against the Wikipedia site, a subset of Wikipedia, or a local Elasticsearch database (default = ONLINE)

  • wiki_file (required in OFFLINE mode) – str Location of the Wikipedia file to work with

  • host (required in ELASTIC mode) – str The Elasticsearch host name

  • port (required in ELASTIC mode) – int The Elasticsearch port number

  • index (required in ELASTIC mode) – str The Elasticsearch index name
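The parameter list above ties each search method to the arguments it needs. The following is a minimal sketch of that rule, not the library's own validation code: `SearchMethod`, `REQUIRED_ARGS` and `missing_args` are hypothetical names introduced for illustration, and the `'offline'` and `'elastic'` enum values are assumptions (only `'online'` appears in the signature).

```python
from enum import Enum
from typing import List, Optional


class SearchMethod(Enum):
    """Hypothetical stand-in for WikipediaSearchMethod."""
    ONLINE = "online"
    OFFLINE = "offline"   # assumed value
    ELASTIC = "elastic"   # assumed value


# Arguments each mode needs, as stated in the parameter list.
REQUIRED_ARGS = {
    SearchMethod.ONLINE: [],
    SearchMethod.OFFLINE: ["wiki_file"],
    SearchMethod.ELASTIC: ["host", "port", "index"],
}


def missing_args(method: SearchMethod, **kwargs: Optional[object]) -> List[str]:
    """Return the required argument names that were omitted or left as None."""
    return [name for name in REQUIRED_ARGS[method] if kwargs.get(name) is None]
```

Calling `missing_args(SearchMethod.ELASTIC, host="localhost", port=9200)` reports that `index` is still needed, which is the kind of check worth running before constructing the extractor in ELASTIC mode.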

Methods

__init__(method, wiki_file, host, port, …)

Extract the relation between two mentions according to Wikipedia knowledge

extract_aliases(pages1, pages2, titles1, titles2)

Check whether the input mentions have an aliases relation

extract_all_relations(mention_x, mention_y)

Try to find whether the mentions have any one or more of the relations this class supports

extract_be_comp(pages1, pages2, titles1, titles2)

Check whether the input mentions have a be-comp/is-a relation

extract_category(pages1, pages2, titles1, …)

Check whether the input mentions have a category relation

extract_disambig(pages1, pages2, titles1, …)

Check whether the input mentions have a disambiguation relation

extract_parenthesis(pages1, pages2, titles1, …)

Check whether the input mentions have a parenthesis relation

extract_relation(mention_x, mention_y, relation)

Base class check: verify that the subclass supports the given relation before executing the subclass

extract_sub_relations(mention_x, mention_y, …)

Check whether the input mentions have the given relation between them

get_phrase_related_pages(mention_str)

Get all WikipediaPages related to this mention string

get_supported_relations()

Return all relations supported by this class

is_both_data_or_time(mention1, mention2)

Check whether both phrases refer to a time or date

is_both_opposite_personal_pronouns(phrase1, …)

Check whether both phrases refer to pronouns

is_part_of_same_name(pages1, pages2)

Check whether the input mentions have a part-of-same-name relation (e.g. page1=John, page2=Smith)

is_redirect_same(pages1, pages2)

Check whether the input mentions have the same Wikipedia redirect page

static extract_aliases(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have an aliases relation

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

  • titles1 – Set[str]

  • titles2 – Set[str]

Returns

RelationType.WIKIPEDIA_ALIASES or RelationType.NO_RELATION_FOUND

extract_all_relations(mention_x: nlp_architect.common.cdc.mention_data.MentionDataLight, mention_y: nlp_architect.common.cdc.mention_data.MentionDataLight) → Set[nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType][source]

Try to find whether the mentions have any one or more of the relations this class supports

Parameters
  • mention_x – MentionDataLight

  • mention_y – MentionDataLight

Returns

One or more of: RelationType.WIKIPEDIA_BE_COMP, RelationType.WIKIPEDIA_TITLE_PARENTHESIS, RelationType.WIKIPEDIA_DISAMBIGUATION, RelationType.WIKIPEDIA_CATEGORY, RelationType.WIKIPEDIA_REDIRECT_LINK, RelationType.WIKIPEDIA_ALIASES, RelationType.WIKIPEDIA_PART_OF_SAME_NAME

Return type

Set[RelationType]
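Because the return value is a set, a caller typically tests membership rather than comparing against a single relation. The sketch below shows one way such a set could be assembled from individual checks. It is an illustration only: `Relation`, `collect_relations` and the two toy checks are hypothetical, and whether the library returns an empty set or a set holding NO_RELATION_FOUND when nothing matches is not stated here (this sketch returns an empty set).

```python
from enum import Enum, auto
from typing import Callable, Iterable, Set


class Relation(Enum):
    """Hypothetical stand-in for a few RelationType members."""
    NO_RELATION_FOUND = auto()
    WIKIPEDIA_ALIASES = auto()
    WIKIPEDIA_CATEGORY = auto()


Check = Callable[[str, str], Relation]


def collect_relations(x: str, y: str, checks: Iterable[Check]) -> Set[Relation]:
    """Run every check and keep only the relations that were actually found."""
    found = {check(x, y) for check in checks}
    found.discard(Relation.NO_RELATION_FOUND)
    return found


def toy_alias_check(x: str, y: str) -> Relation:
    # Toy rule for the demo: case-insensitive equality counts as an alias.
    if x.lower() == y.lower():
        return Relation.WIKIPEDIA_ALIASES
    return Relation.NO_RELATION_FOUND


def toy_never_matches(x: str, y: str) -> Relation:
    # Toy check that never finds a relation.
    return Relation.NO_RELATION_FOUND


relations = collect_relations("IBM", "ibm", [toy_alias_check, toy_never_matches])
has_alias = Relation.WIKIPEDIA_ALIASES in relations
```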

static extract_be_comp(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a be-comp/is-a relation

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

  • titles1 – Set[str]

  • titles2 – Set[str]

Returns

RelationType.WIKIPEDIA_BE_COMP or RelationType.NO_RELATION_FOUND

static extract_category(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a category relation

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

  • titles1 – Set[str]

  • titles2 – Set[str]

Returns

RelationType.WIKIPEDIA_CATEGORY or RelationType.NO_RELATION_FOUND

static extract_disambig(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a disambiguation relation

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

  • titles1 – Set[str]

  • titles2 – Set[str]

Returns

RelationType.WIKIPEDIA_DISAMBIGUATION or RelationType.NO_RELATION_FOUND

static extract_parenthesis(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, titles1: Set[str], titles2: Set[str]) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have a parenthesis relation

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

  • titles1 – Set[str]

  • titles2 – Set[str]

Returns

RelationType.WIKIPEDIA_TITLE_PARENTHESIS or RelationType.NO_RELATION_FOUND

extract_relation(mention_x: nlp_architect.common.cdc.mention_data.MentionDataLight, mention_y: nlp_architect.common.cdc.mention_data.MentionDataLight, relation: nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType

Base class check: verify that the subclass supports the given relation before executing the subclass

Parameters
  • mention_x – MentionDataLight

  • mention_y – MentionDataLight

  • relation – RelationType

Returns

The given relation if the mentions have it, RelationType.NO_RELATION_FOUND otherwise

Return type

RelationType
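The base-class guard described for extract_relation can be sketched as a small template-method pattern. This is an assumption about the general shape, not the library's source: `BaseExtractor`, `AliasOnlyExtractor` and the toy matching rule are hypothetical, and plain strings stand in for MentionDataLight.

```python
from enum import Enum, auto
from typing import List


class Relation(Enum):
    """Hypothetical stand-in for a few RelationType members."""
    NO_RELATION_FOUND = auto()
    WIKIPEDIA_ALIASES = auto()
    WIKIPEDIA_CATEGORY = auto()


class BaseExtractor:
    """Hypothetical base class that guards calls into the subclass."""

    def get_supported_relations(self) -> List[Relation]:
        raise NotImplementedError

    def extract_sub_relations(self, x: str, y: str, relation: Relation) -> Relation:
        raise NotImplementedError

    def extract_relation(self, x: str, y: str, relation: Relation) -> Relation:
        # Guard: never call the subclass for a relation it does not support.
        if relation not in self.get_supported_relations():
            return Relation.NO_RELATION_FOUND
        return self.extract_sub_relations(x, y, relation)


class AliasOnlyExtractor(BaseExtractor):
    """Toy subclass that supports a single relation."""

    def get_supported_relations(self) -> List[Relation]:
        return [Relation.WIKIPEDIA_ALIASES]

    def extract_sub_relations(self, x: str, y: str, relation: Relation) -> Relation:
        # Toy rule for the demo: case-insensitive equality counts as a match.
        return relation if x.lower() == y.lower() else Relation.NO_RELATION_FOUND


extractor = AliasOnlyExtractor()
supported = extractor.extract_relation("IBM", "ibm", Relation.WIKIPEDIA_ALIASES)
unsupported = extractor.extract_relation("IBM", "ibm", Relation.WIKIPEDIA_CATEGORY)
```

With this layout the subclass only has to implement the relation logic itself, and a request for an unsupported relation resolves to NO_RELATION_FOUND without reaching the subclass.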

extract_sub_relations(mention_x: nlp_architect.common.cdc.mention_data.MentionDataLight, mention_y: nlp_architect.common.cdc.mention_data.MentionDataLight, relation: nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType) → nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType[source]

Check whether the input mentions have the given relation between them

Parameters
  • mention_x – MentionDataLight

  • mention_y – MentionDataLight

  • relation – RelationType

Returns

The given relation if the mentions have it, RelationType.NO_RELATION_FOUND otherwise

Return type

RelationType

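get_phrase_related_pages(mention_str: str) → nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages

Get all WikipediaPages related to this mention string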

Parameters

mention_str – str

Returns

WikipediaPages

static get_supported_relations() → List[nlp_architect.data.cdc_resources.relations.relation_types_enums.RelationType][source]

Return all relations supported by this class

Returns

List[RelationType]

static is_both_data_or_time(mention1: nlp_architect.common.cdc.mention_data.MentionDataLight, mention2: nlp_architect.common.cdc.mention_data.MentionDataLight) → bool[source]

Check whether both phrases refer to a time or date

Returns

bool

static is_both_opposite_personal_pronouns(phrase1: str, phrase2: str) → bool[source]

Check whether both phrases refer to pronouns

Returns

bool

is_part_of_same_name(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages) → bool[source]

Check whether the input mentions have a part-of-same-name relation (e.g. page1=John, page2=Smith)

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

Returns

bool

static is_redirect_same(pages1: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages, pages2: nlp_architect.data.cdc_resources.data_types.wiki.wikipedia_pages.WikipediaPages) → bool[source]

Check whether the input mentions have the same Wikipedia redirect page

Parameters
  • pages1 – WikipediaPages

  • pages2 – WikipediaPages

Returns

bool